Mining Parallel Corpora from Sina Weibo and Twitter
Similar resources
Mining Parallel Corpora from Sina Weibo and Twitter
Microblogs such as Twitter, Facebook, and Sina Weibo (China’s equivalent of Twitter) are a remarkable linguistic resource. In contrast to content from edited genres such as newswire, microblogs contain discussions of virtually every topic by numerous individuals in different languages and dialects and in different styles. In this work, we show that some microblog users post “self-translated” m...
Topical differences between Chinese language Twitter and Sina Weibo
Sina Weibo, China’s most popular microblogging platform, is currently used by over 500M users and is considered to be a proxy of Chinese social life. In this study, we contrast the discussions occurring on Sina Weibo and on Chinese language Twitter in order to observe two different strands of Chinese culture: people within China who use Sina Weibo with its government-imposed restrictions and th...
Application of Association Rule Mining Theory in Sina Weibo
A user profile contains information about a user. Substantial effort has been made to understand users’ behavior by analyzing their profile data. Online social networks provide an enormous amount of such information for researchers. Sina Weibo, a Twitter-like microblogging platform, has achieved great success in China, although studies on it are still at an early stage. This pap...
A Study on Strength of Sina Weibo
With the widespread use of mobile smart devices and their ubiquitous network access, microblogs have gained significant influence. A key problem, however, is how to measure the importance of a single node in the microblog space. In this paper, we propose a method to evaluate and compute the influence of a node in Sina Weibo. An algorithm named MicroV is proposed to quantify the strength of one user in the m...
Hybrid Parallel Sentence Mining from Comparable Corpora
Mining for parallel sentences in comparable corpora is much more difficult than aligning sentences in parallel corpora. Sentence alignment in parallel corpora usually exploits simple empirical evidence (turned into assumptions) such as (i) the length of a sentence is proportional to the length of its translation and (ii) the discourse flow is necessarily the same in both parts of the bi-text ...
Journal
Journal title: Computational Linguistics
Year: 2016
ISSN: 0891-2017, 1530-9312
DOI: 10.1162/coli_a_00249